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tecting errors in a computer system including the 
steps of executing sequences of instructions of 
a software program (12) on each of a reference 
system (16) and a test system (15). detecting and 
recording state of the reference system (16) and 
the test system (15) at comparable points in the 
execution of the program (12), and comparing 
the detected state of the reference system (16) 
and the test system (15) at selectable comparable 
points in the sequence of instmctions including 
the end of the sequence of instructions. In a par* 
ticular embodiment, the execution of portions of 
the sequence of instructions between selectable 
comparable points on each of the reference sys- 
tem (16) and the test system (15) is automatically 
replayed if a difference in the compared state of 
the systems is detected. 
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METHOD AND APPARATUS FOR CORRECTING ERRORS IN COMPUTER 
SYSTEMS 

5 BACKGROUND OF THE INVENTION 
Field Of The Invention 

This invention relates to computer systems and, more particularly, to systems 
for rapidly and accurately debugging computer systems. 

History Of The Prior Art 

10 It is often desirable to detect errors in a computer process or a computer 
system. To accomplish this, one typically tests the behavior of the system 
being designed (the "test system") against models of a system (the "reference 
system") which are known to function correctly. If the behaviors of the test 
and reference systems agree, the test system is presumed to be functioning 

15 correctly. If the behaviors of the systems differ, an error has been detected. 

There are many ways to compare the behavior of two systems. One is to 
cause each system to generate a stream of internal or external events such as 
procedure calls, state transitions, or bus signals, and compare the two 
streams. However, some event streams may be too difficult to capture to 
20 make comparison of the event streams practical. Other event streams may be 
too expensive to capture. Yet other event streams may be too coarse-grained 
to provide useful localization of the errors. In some event streams, the events 
may depend too much on the implementations of the systems and thus be 
incomparable. 

25 Another comparison technique is to run both systems, record the state of the 
systems while running, and then compare the state of the systems. State 
comparison may suffer the same kinds of problems as the comparison of 
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events streams. Some state may be expensive or impossible to access or may 
depend too much on implementation details of the two systems. 

Different methods may be used to accomplish comparison of either state or 
events. Traditionally, where the reference system is informal or is a non- 
5 executable, the method has been an ad hoc search for various test state or 
events that are inconsistent with state or events produced by the reference. 
This method, of course, offers significant chance to miss important problems 
which may reside in the test system. 

Where the reference is an executable specification, it would be desirable to be 
10 able to simply run the two systems simultaneously and compare the state at 
any instant to detect differences. The problems delineated above make this 
more difficult. 

The exact comparison of system state after each step is not generally 
possible. First, the test system and reference system necessarily differ in 

15 some manner (such as function, implementation, or performance) so that the 
total state of the systems is not comparable at all times. With some systems, 
the total state may never be comparable. For example, it is often desirable to 
compare the state of a test system which is significantly different that the 
reference system as when porting an application program from one type of 

20 processor running a first operating system to another type of processor 
running a second operating system. What is necessary is that the test 
system produce the same final results as the reference system. Since the 
processors and operating systems differ, the operations being performed at 
any instant in running the two systems (and thus the state) will probably 

25 differ. 
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Second, it may be prohibitively expensive to repeatedly compare the state 
after each operation by each system. In order to compare the state of two 
systems, all of memory (possibly including second level storage) and all of the 
registers of the two systems must be compared to see if they are the same. 
5 The sheer size of the state comparison makes this expensive. 

Finally, some state information may be unavailable, for example state in 
unreadable processor registers or state that is represented implicitly in one of 
the systems. Therefore, the comparison must often be selective. 

Even though the comparison of state at all points during execution is 
10 difficult, it is possible to select points at which the results of the test and 

reference system should be the same. At these points, a comparison of state 
may be taken. However, to date no automated or rapid method for comparing 
systems has been devised. 

It is desirable to provide a method and apparatus for rapidly detecting errors 
15 in a test system. 

More particularly, it is desirable to provide a method and apparatus for 
detecting errors in a computer process or a computer system. 

Summary Of The Invention 

The present invention is realized by a computer implemented process for 
20 detecting errors in computer systems comprising the steps of executing 

sequences of instructions of a softwaire program on each of a reference system 
and a test system, detecting and recording state of the reference system and 
the test system at comparable points in the execution of the programi, 
comparing the detected state of the reference system and the test system at 
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selectable comparable points in the sequence of instructions including the 
end of the sequence of instructions. 

In a particular embodiment, the execution of portions of the sequence of 
instructions between selectable comparable points on each of the reference 
5 system and the test system is automatically replayed if a difference in 
compared state of the systems is detected. 

These and other features of the invention will be better understood by 
reference to the detailed description which follows taken together with the 
drawings in which like elements are referred to by like designations 
10 throughout the several views. 

Brief Description Of The Drawings 

Figure 1 is a block diagram of a system designed in accordance with the 
present invention. 

Figure 2 is a block diagram of one embodiment of a system designed in 
15 accordance with the present invention. 

Figure 3 is a diagram illustrating a process in accordance with the present 
invention. 

Detailed Description 

Referring now to Figure 1 , there is illustrated a block diagram of apparatus 
20 constructed in accordance with the present invention. The present invention 
utilizes a control mechanism 10 which receives instructions from a program 
12 it is desired to execute on a test system 15 and one or more reference 
systems 16 utilizing input/output devices or models 18. The program 12 
may be a hardware or software program or process or any variation of those. 
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The test and reference systems 15 and 16 may each be hardware, software, or 
some combination of each. The control mechanism 10 controls the execution 
of the program 12 by both the test system 15 and the reference system 16. 
The program 12, although shown as a separate entity in the figure, may be 
embedded in the systems or supplied from outside the systems. The program 
12 may have different representations for each of the systems but must at 
some level compute the same results. The control mechanism 10 may be any 
control that provides a means to run the program 1 2 in increments and to 
read or verify the state of the two systems in increments. 

In principle, it would be possible to execute the program 12 by executing the 
program in single steps and comparing the state of the systems at each step. 
However, checking a large amount of state at every step will often be 
prohibitively expensive. In addition, simple stepping may not be possible. 
For example, some results may be computed in one order by one system and 
in a different order on another. Instead, the present invention uses the 
control mechanism 10 to execute several or many steps in the test and 
reference systems. Thus, the control mechanism may execute a known and 
often large number of steps of the program 12 on the test system 15 and then 
repeat the execution of the steps on the reference system 16. Then, state 
comparison may be performed. By executing a number of steps before 
comparison, the cost of comparison is amortized over several steps, and the 
current invention is also able to find errors in systems that do not perform 
identical computations at each step. 

The control mechanism 10 executes a comparison predicate (an expression 
that is to be evaluated) that evaluates to either true or false. The comparison 
predicate need not be embedded in the control 10 and may vary depending 
on the systems under test, on the kinds of errors being analyzed, and on 
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tradeoffs between the performance and accuracy of error localization. In 
addition, the comparison predicate may change during execution. For 
example, if the reference system and the test system are both processors, the 
comparison predicate may frequently perform a quick check of the user- 
5 visible registers of both processors and may also periodically perform a more 
expensive but more comprehensive check of hidden registers and memory. 

It should be noted that each of the test and reference systems may have state 
that is never used in comparison and which is therefore ignored in the 
comparison. Likewise, each system may represent some state in different 

10 ways, such that the comparison predicate or its surrogate must perform some 
state transformation in order to perform the comparison. The notion of 
"state" may be extended to include the state of an event log, in order to take 
advantage of event stream debugging. For example, the predicate may 
compare elements in a time stamped input/ output log, in order to ensure 

15 that the external behavior of both systems is the same. Finally, the predicate 
may compute approximate information, for example a checksum and cyclical 
redundancy check (CRC) of a set of values rather than comparing all of the 
values explicitly. 

The control program 10 controls execution based on a number of premises. 

20 First, it will be recognized that if the same program 12 is correctly executed 

by two systems, then the individual instructions executed by the two systems 
may be different but there are points where specific parts of the states of the 
two systems should be equivalent if the test system and the reference system 
are each functioning correctly to provide the same results. These points are 

25 referred to in this specification as "comparable points in virtual time" or 
simply "comparable points." For example, a reference system may be a 
simple processor (such as a RISC processor) combined with a simulator and 
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which is used to emulate a complex processor (such as a CISC processor), 
while a test system may be such a complex processor. The reference system 
will typically execute several instructions to mimic the behavior of a single 
instruction in the test system. Consequently, the state of the reference 
5 system will only be comparable at times that correspond to whole 
instructions being executed in the test system. 

The present invention is also based on the recognition that, at some level, the 
operations of correctly functioning computer systems are deterministic. 
Thus, when a sequence of instructions has commenced running on a 

10 processor, the results of those instructions are determined by the initial 

machine state except for events external to the sequence of instructions such 
as interrupts and input/ output operations. Only these external operations 
can cause a variation in the manner in which the program executes on the 
system. To eliminate such variations in some embodiments, the behavior of a 

15 single set of device models is shared by the reference system and by the test 
system. This simplifies a variety of state comparison issues. An additional 
benefit is that debugging may be substantially faster with a single set of 
devices than with separate devices for both systems. The present invention 
takes extemal events from the device models 18 and shares them between the 

20 test and reference systems. The control mechanism 10 ensures that the 

external events are presented to the test and reference systems at the same 
comparable points in virtual time, so that extemal events will have the same 
effects. Consequently, the two systems 15 and 16 are theoretically executing 
programs producing identical results with identical external events. However, 

25 it should be recognized that a large number of comparable points may exist 
between any two comparable points at which extemal events occur. 
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The recognition tiiat with correctiy functioning computer systems once a 
sequence of instructions has commenced running on a processor the results 
of those instructions are determined except for events external to the 
sequence of instructions and the elimination of asynchronous external 

5 events, allows comparisons at comparable points in virtual time during 
execution of the program by the two systems. The control mechanism 10 
periodically compares some portion of the state of the reference system and 
test system. When parts of the state of the systems vary (are incompatible 
according to the comparison predicate) at any comparable point, an error 

10 (called a divergence) has been encountered. The divergence may have 

occurred at any point since the last state comparison. Detecting a divergence 
utilizing comparable points at which sufficient state is reasonably available to 
detect system errors provides significant advances over prior art error 
detection systems. 

15 The current invention can also automatically narrow the scope of the error by 
searching over virtual time since the last state comparison that indicated 
identical state. To accomplish this, the execution of the program is halted at 
the earliest comparable point at which a deviation of state is detected. As a 
result of being able to record and replay external events which may be shared 

20 during execution and replay, it is possible to "return to an earlier comparable 
point" and then execute forward again and get the same results. This allows 
"replaying" a portion of the program which has been executed and in which 
an error occurred. To support repla3dng, the control mechanism 10 
periodically checkpoints or records the state of the test and reference systems 

25 at preselected comparable points during execution of the program by the 
systems. The operation of the systems may then be returned to an older 
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checkpoint when a deviation is detected and the program 12 then executed 
forward to the desired comparable point. 

A checkpoint need not capture all system state, but it should provide 
sufficient state to restart execution. Checkpointing more state tends to 

5 produce better debugging accuracy. Checkpointing may be performed with 
any period from never to every step; changing the checkpoint interval affects 
the performance of both forward execution and replay. Replay also relies on a 
log of external events. When execution is restarted for a replay, the external 
events are taken from the log in order to ensure deterministic behavior on 

10 replay. 

Many search techniques are possible; the best choice may vary depending on 
the relative speeds of forward execution and replay, state comparison, and 
the like. Moreover, when an error is detected, the control mechanism 10 may 
switch to using a more comprehensive or accurate state comparison 

15 predicate. In one embodiment, the control mechanism 10 may perform a 

binary search in virtual time as follows. The control mechanism 10 may set 
the beginning of the replay to a previous comparable point at which a 
successful state comparison occurred, probably the last comparison at a 
strictest level. The large number of comparable points which may exist 

20 between any two comparable points at which external events occur allows 
relatively fine grained isolation of errors after a deviation is detected. The 
control mechanism 10 may set the end of the replay to the comparable point 
at which the failed state comparison occurred and then execute the replay of 
the program to an intermediate comparable point half way between the 

25 beginning and end. The state of the systems is then compared again at this 
halfway point. If the state comparison fails at the halfway point, the end of 
the replay is set to the intermediate comparable point and searching 
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continues with replay beginning at the same beginning point and continuing 
to a second intermediate comparable point half way between the beginning 
and new end point. The state of the systems is then compared again at the 
halfway point. If state comparison succeeds, the beginning is set to the 
5 intermediate comparable point and searching continues. In this manner, the 
search for an error is rapidly narrowed. 

There are various circumstances that can terminate the search. For example, 
the control mechanism 10 may terminate the divergence search when the 
search range becomes so small that the range cannot be reduced further. At 
10 that point the divergence is considered "found." The smallest range depends 
on the test and reference systems and on the program being executed. The 
error may be reported to the user. 

The manner in which the control mechanism 10 isolates the test and reference 
systems from asynchronous external events by presenting the same external 

15 events to each of the test and reference systems offers a number of 

advantages in addition to the ability to assure consistency and provide replay 
execution of the program 12. For example, over arbitrary segments of the 
execution of the program 12, the control mechanism 10 may selectively vary 
the rate and distribution of external events over arbitrary segments of the 

20 execution of the program 12. This allows the control mechanism to generate 
test situations such as unusually high arrival rates for external events. 

The control mechanism 10 may also perform "perturbation analysis" by 
repeating a segment of execution which has provided a successful state 
comparison. Each time the execution is repeated, the initial state or delivery 
25 of external events may be altered but in such a matter that the final outcome 
of the execution should not change. If the final state does change from run to 
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run, an error has been detected; and the error isolation techniques mentioned 
above can be used to narrow the location of the error in time by comparing 
against reference systems. For example, a critical section of the program 12 
may be run several times; and each time the control mechanism 10 may 
5 deliver an interrupt at a different instruction in the sequence or change the 
initial content of caches in the test or reference system. 

The invention is adapted to function with systems of different levels of 
complexity. For example, the test system may utilize register renaming or 
similar techniques to improve performance. Some data may be in a hardware 
10 or software cache in one system and in primary memory in the other, or 

represented implicitly in one system (requiring reconstruction of the data in 
question in order to effect state comparison). In order to compare state, the 
comparison predicate calls upon surrogates in the reference system and the 
test system in order to compare state by name rather than by location. 

15 There are many embodiments of the present invention. In one embodiment, 
the reference system 16 is a straightforward decode-and-dispatch interpreter 
while the test system executes the same input programs using dynamic 
compilation of virtual machine code to native machine code. In order to 
maintain comparable points in virtual time, the reference system increments 

20 virtual time for each instruction fetched, while the test system increments 
virtual time using instrumentation code inserted in each dynamically- 
compiled block. The comparison predicate compares only state for the virtual 
machine that is being executed by each system. 

One particularly useful embodiment of the present invention shown in Figure 
25 2 utilizes a software control program 20 which receives instructions from a 
program 12 it is desired to execute on the two different systems via a 
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debugger program 14. The debugger program 14 may be selected as any 
useful program adapted to provide means by which the program 12 may be 
run in increments iand the state of the system running the program tested at 
desired increments. The control program 20 controls the operations of both a 
5 reference system 16 and the test system 15 in executing the program 12. 

In this embodiment of the invention, the test system 15 simulates a processor 
executing a translation program 13 which dynamically translates instructions 
of the program 12 into series of instructions which the processor is capable of 
executing to accomplish results of the instructions of the program 12. In 
10 executing the program 12 using the debugger 14, the control program 20 
causes the system 16 to execute a known and often large number of the 
original instructions of the program 12 and then repeats the execution of 
these instructions using the system 15. 

The control program 20 controls the entire execution by the two systems 15 
15 and 16. In the Figure 2 diagram illustrating apparatus for implementing the 
present invention, the debugger software 14 receives a sequence of 
instructions from the program 12 which is to known to run on a particular 
computer system such as a computer utilizing an Intel X86 microprocessor. 
A control program 20 designed in accordance with the present invention 
20 provides a typical computer system interface to the debugger software 14 and 
controls the application of the instructions to the test and reference systems 
15 and 16. The control program 20 presents the same interface to the 
debugger software 14 as do the systems 15 and 16 to the control program 20. 
The control program 20 also interposes itself between the systems 15 and 16 
25 and their devices 18 in order to provide logging and replay of external events. 
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There are many ways of maintaining comparable points of virtual time. In 
one embodiment, the program 12 is translated by a translation program 13 to 
instructions for an instruction set which executes on the test system 15. The 
test system 15 which includes a microprocessor combined with this 
translation program 13 is designed to execute programs of an instruction set 
originally designed to execute on a microprocessor defined by the reference 
system 16. In order to allow the determination of comparable points of 
virtual time at which the state of the original program 12 and the instructions 
translated by the translation program 13 are the same, the translation 
program 13 maps the points in the translation which coincide with the end of 
each instruction of the program 12 and which thus indicate the comparable 
points which may be used in comparing the two simulation models 1 5 and 
16. The control program 20 looks for these comparable points in determining 
when and where to start and stop executing the sequences of instructions on 
the systems 15 and 16. 

In addition to using this time domain mapping, the control program 20 
provides means for detecting the mapping of the address space of the two 
simulation models 15 and 16 at the comparable points in the execution of the 
program 12 by the two systems 15 and 16. For example, one or the other of 
the systems may utilize register renaming or similar techniques to accelerate 
processing. In order to test the consistency of state using the two systems 15 
and 16, the registers actually holding data which should be the same must be 
determined. Similarly, memory data may be stored in a processor cache in 
one system and not yet stored in memory, while in the other system, the 
same data may have been stored in main memory. Again, the correct 
memory must be compared to obtain useful results. In one embodiment, the 
control program 20 interrogates the systems under its control in terms of 
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abstract register or memory names. Surrogates in the test and reference 
systems use private mappings from abstract names to storage locations to 
provide the desired data. 

Once provided with these two mappings, a selected sequence of instructions 

5 from the program 12 are furnished by the debugger software 14 to the control 
program 20 for execution. The debugger software 14 typically provides the 
ability to execute a program in sequences determined by a user and to 
determine state of the executing model at selected times but has no ability to 
accomplish comparisons between the execution of a program on one system 

10 or another. The debugger software 14 furnishes the sequence of instructions 
to the control program 20. The control program 20 furnishes the sequence of 
instructions to the reference system 16 which runs the sequence of 
instructions. The control program 20 records the state of the system 16 at 
the beginning of the sequence before commencing the execution of the 

15 sequence on the system 16. As the sequence of instructions is executed on 
the system, the control program 20 records and time stamps each external 
event. Thus, the control program 20 records each input/ output operation 
including the commands, addresses, and data transferred to any simulated 
input/ output devices 18 and the responses returned to the system 16 from 

20 the simulated devices 18. The control program 20 also records each 
exception to which the system 16 responds during the sequence. 

Next, the control program 20 executes the same sequence of instructions on 
the test system 15 to the ending comparable point of the sequence. In the 
execution of the sequence of instructions, the external operations 
25 {input/output or exception) which have been recorded for the reference 

system 16 are utilized from the log of results obtained from execution of the 
sequence by the system 16. Thus, if the system 16 caused an input/output 
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operation to occur, the externally generated return from that operation is 
furnished to the system 15. In this manner, the return values are identical to 
those fumished to the reference system 16 so that the result within the test 
system 15 should be the same as the result within the system 16. Similarly, 

5 if the system 16 received an interrupt, the interrupt is recorded by the control 
program 20 and is later furnished to the test system 15 so that its response 
should be the same as the response of the reference system 16 if the test 
system 15 is correct. Using the external events logged in executing the 
program 12 by the system 16 eliminates the work of simulating devices which 

10 will function correctly with the second model, assures that the external 

events affecting the simulations are identical, and thus eliminates the chance 
that the program execution will vary because of asynchronous external 
events. 

The sequence is completed in this manner on the system 15. Then, the state 
15 of the two systems is compared. If the state of the two systems is the same at 
the end of the sequence, a second sequence of instruction from the program 
12 is furnished to the debugger 14 for execution on the two systems 15 and 
16. 

If the execution of any sequence of instructions from the program 12 
20 produces different state in the test and reference systems at the comparable, 
then the program 12 is replayed from a previous checkpointed comparable 
point in the execution at which the comparison produced identical state. In 
the specific embodiment described herein, the process is rerun in a binary 
search mode in the manner explained above. In replaying the execution of a 
25 preceding sequence of instructions, it is useful to utilize the log of external 
events previously recorded during execution by the reference system to 
eliminate any variations in external events for execution on both systems. If 
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a first half of the sequence produces the same state at a middle comparable 
point, a first half of the second half of the sequence of instructions is replayed 
for the test and reference systems. This binary testing of portions of the 
sequence continues so that the point of divergence of the two simulations is 
5 quickly isolated. 

As will be seen, this binary testing for errors in the execution of a sequence of 
instructions provides very rapid isolation of the point of error. It is not 
necessary that a search to isolate errors be a binary search. For example, if 
executing the program is less expensive in time than is reversing the program 

10 to find a most recent comparable point at which state was consistent, it may 
be faster to replay smaller portions of the program and thereby reduce the 
number of times required to reverse the program in order to isolate errors. In 
any case, the ability to run the two models essentially in parallel through a 
long sequence of instructions which may vary in length under control of the 

15 operator or the control program allows very rapid and accurate debugging of 
the slave simulation model without the tedious step by step process required 
by the prior art. 

Although the present invention has been described in terms of a preferred 
embodiment, it will be appreciated that various modifications and alterations 
20 might be made by those skilled in the art without departing from the spirit 
and scope of the invention. The invention should therefore be measured in 
terms of the claims which follow. 

What Is Claimed Is: 
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1 Claim 1 . A system for correcting errors in a computer system comprising; 

2 a test system in which errors are to be corrected; 

3 a reference system; and 

4 a control mechanism for executing a program on the reference system and 

5 the test system, 

6 the control mechanism including: 

7 means for executing sequences of instructions of the program on 

8 each of the reference system and the test system, 

9 means for detecting and recording state of the reference system 

10 and the test system at comparable points in the execution of the 

11 program, and 

12 means for comparing the detected state of the reference system 

13 and the test system at selectable comparable points in the 

14 sequence of instructions including the end of the sequence of 

15 instructions. 

1 Claim 2. A system as claimed in Claim 1 which further comprises means 

2 for replaying the execution of portions between selectable comparable points 

3 of the sequence of instructions on each of the reference system and the test 

4 system if a difference in compared state of the systems is detected. 

1 Claim 3. A system as claimed in Claim 2 in which the means for replaying 

2 the execution of portions between selectable comparable points of the 

3 sequence of instructions on each of the reference system and the test system 

4 if a difference in compared state of the systems is detected responds to 
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5 detection of a difference in compared state to reset a portion of the sequence 

6 of instruction to be replayed. 

1 Claim 4. A system as claimed in Claim 3 in which the means for replaying 

2 the execution of portions between selectable comparable points of the 

3 sequence of instructions on each of the reference system and the test system 

4 if a difference in compared state of the systems is detected conducts a search 

5 between a last comparable point at which state correctly compared in the test 

6 and reference systems and the comparable point at which the difference in 

7 compared state was detected. 

1 Claim 5. A system as claimed in Claim 4 in which the search conducted is 

2 a binary search. 

1 Claim 6. A system as claimed in Claim 1: 

2 in which the control mechanism further comprises means for providing 

3 external events to the reference system, 

4 which further comprises means for recording external events provided by the 

5 means for providing external events to the reference system; and 

6 in which the control mechanism comprises means for utilizing recorded 

7 external events of the reference system as external events for the test system 

8 in executing sequences of instructions, 

1 Claim 7. A system as claimed in Claim 6 in which the control mechanism 

2 comprises means for varying the external events and the comparable points 

3 at which external events are provided. 



wo 98/38575 



PCTAJS98/02673 



19 

1 Claim 8. A system as claimed in Claim 2: 

2 in which the control mechanism further comprises means for providing 

3 external events to the reference system, 

4 which further comprises means for recording external events provided by the 

5 means for providing external events to the reference system; and 

6 in which the control mechanism comprises means for utilizing recorded 

7 external events of the reference system as external events for the reference 

8 and the test system in replaying the execution of sequences of instructions. 

1 Claim 9. A system as claimed in Claim 1 in which the means for detecting 

2 and recording state of the reference system and the test system at 

3 comparable points in the execution of the program detects and records 

4 surrogate values representing state in the reference system and the test 

5 system. 

1 Claim 10. A system as claimed in Claim 1 in which the control mechanism 

2 comprises means for replaying the execution of portions between selectable 

3 comparable points of the sequence of instructions on each of the reference 

4 system and the test system while varying the application of external events. 

1 Claim 11. A system as claimed in Claim 1 in which the means for 

2 comparing the detected state of the reference system and the test system at 

3 selectable comparable points in the sequence of instructions including the 

4 end of the sequence of instructions is capable of selecting different state. 

1 Claim 12. A computer implemented process for detecting errors in computer 

2 systems comprising the steps of: 
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3 executing sequences of instructions of a software progratm on each of a 

4 reference system and a test system, 

5 detecting and recording state of the reference system and the test system at 

6 comparable points in the execution of the program, and 

7 comparing the detected state of the reference system and the test system at 

8 selectable comparable points in the sequence of instructions including the 

9 end of the sequence of instructions. 

1 Claim 13. A computer implemented process as claimed in Claim 12 further 

2 comprising the step of replaying the execution of portions of the sequence of 

3 instructions between selectable comparable points on each of the reference 

4 system and the test system if a difference in compared state of the systems is 

5 detected. 

1 Claim 14. A computer implemented process as claimed in Claim 13 in 

2 which the step of replaying the execution of portions between selectable 

3 comparable points of the sequence of instructions on each of the reference 

4 system and the test system if a difference in compared state of the systems is 

5 detected is accomplished in response to detection of a difference in compared 

6 state. 

1 Claim 15. A computer implemented process as claimed in Claim 13 in 

2 which the step of replaying the execution of portions between selectable 

3 comparable points of the sequence of instructions on each of the reference 

4 system and the test system if a difference in compared state of the systems is 

5 detected includes conducting a search between a last comparable point at 

6 which state correctly compared in the test and reference systems and the 

7 comparable point at which the difference in compared state was detected. 
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1 Claim 16. A computer implemented process as claimed in Claim 13 in 

2 which the step of replaying the execution of portions between selectable 

3 comparable points of the sequence of instructions on each of the reference 

4 system and the test system if a difference in compared state of the systems is 

5 detected includes conducting a binary search between a last comparable 

6 point at which state correctly compared in the test and reference systems and 

7 the comparable point at which the difference in compared state was detected. 

1 Claim 17. A computer implemented process as claimed in Claim 13 in 

2 which the step of replaying the execution of portions between selectable 

3 comparable points of the sequence of instructions on each of the reference 

4 system and the test system if a difference in compared state of the systems is 

5 detected includes conducting a binary search between a last comparable 

6 point at which state correctly compared in the test and reference systems and 

7 the comparable point at which the difference in compared state was detected, 

8 the search technique being selected to enhance searching speed. 

1 Claim 18. A computer implemented process as claimed in Claim 12 in 

2 which the step of detecting and recording state of the reference system and 

3 the test system at comparable points in the execution of the program includes 

4 detecting and recording surrogate values representing state in the reference 

5 system and the test system. 

1 Claim 19. A computer implemented process as claimed in Claim 12 

2 including the further steps of: 

3 providing external events to the reference system, 



4 



5 



recording external events provided by the means for providing external events 
to the reference system; and 



wo 98/38575 



PCT/US98/02673 



22 

6 utilizing recorded external events of the reference system as external events 

7 for the test system in executing sequences of instructions. 

1 Claim 20. A computer implemented process as claimed in Claim 19 which 

2 comprises the further step of varying the external events and the comparable 

3 points at which external events are provided. 

1 Claim 21. A computer implemented process as claimed in Claim 13 which 

2 comprises the further steps of: 

3 providing external events to the reference system, 

4 recording external events provided to the reference system; and 

5 utilizing recorded external events of the reference system as external events 

6 for the reference and the test system in replaying the execution of sequences 

7 of instructions. 

1 Claim 22. A computer implemented process as claimed in Claim 13 in 

2 which the step of replaying the execution of the sequence of portions between 

3 selectable comparable points of the sequence of instructions on each of the 

4 reference system and the test system includes varying the application of 

5 external events. 

1 Claim 23. A computer implemented process as claimed in Claim 12 in 

2 which the step of comparing the detected state of the reference system and 

3 the test system at selectable comparable points in the sequence of 

4 instructions including the end of the sequence of instructions includes 

5 selecting the state to compare. 
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